Abstract

Proper folding of deeply knotted proteins has a very low success rate even in structure-based
models, which favor formation of the native contacts but have no topological bias.
By employing a structure-based model, we demonstrate that cotranslational folding on a
model ribosome may enhance the odds of forming the trefoil knot in protein YibK without any
need to introduce non-native contacts. The ribosome is represented by a repulsive
wall that keeps elongating the protein. On-ribosome folding proceeds through a slipknot
conformation. We elucidate the mechanics and energetics of its formation. We show that 
the knotting probability in on-ribosome folding is a function of temperature and that there 
is an optimal temperature for the process. Our model often leads to the establishment of 
the native contacts without formation of the knot. 


1 Introduction 

There are several hundreds of knot-containing structures mmm in the Protein Data Bank 
(PDB) and their topology can be characterized primarily by how many intersections a 
backbone makes with itself on making its two-dimensional projection. The knot extends 
between two end points, n_1 and n_2 > n_1, along the sequence. The knot ends are defined
operationally through systematic cutting-away of the amino acids from both termini until
the knot disintegrates. The last site that still supports the knot is its end point. If n_1
and n_2 are both distant from the termini, the knot is considered to be deep; otherwise it is
called shallow. One needs closed lines to declare existence of a knot on a line with certainty. 
Backbones of proteins are generally not closed but for deep knots the determination of a 
knot-type is quite reliable. 

Stretching of a knotted protein either at constant speed mmu or at constant force 
[9] results in a step-wise knot tightening process in which the knot ends jump. Here, 
we consider two other conformation-changing processes m in knotted proteins: folding
and unfolding through heating. Folding from an extended state to a knotted native
conformation of protein YibK has been reported mm to be difficult with, at best, a
1-2% success rate even if one uses a coarse-grained structure-based model. Here, we
examine whether nascent conditions, such as those that exist when a protein is formed by the
ribosome [laniiiaiie], can help in establishing the native knot in proteins. We propose
a simple generic model in which the ribosome is represented as an infinite repulsive plate 
which spawns proteins. Our model is an oversimplification of the real geometry of the 
ribosome - the peptide chain is formed in the ribosome tunnel and it undergoes folding, 
at least partially, within it. The tunnel has a diameter that varies between 10 and 20 Å
and the largest cavity in the tunnel cannot encompass a sphere with a radius larger than
9.5 Å [HTIITTI ITK]. This geometry provides stronger confinement than that induced by the
plane. Nevertheless, the model with the plane is the simplest one that introduces the new
qualitative features that are brought in by confinement.

Here, our focus is on YibK [20] from Haemophilus influenzae, which contains a deep
trefoil knot with three intersections. The corresponding PDB code is 1J85. (We shall
refer to proteins through their PDB structure codes.) This protein is probably the most
frequently cited example of a deeply knotted protein mm. Its radius of gyration is about
15 Å. We find that the nascent conditions do help in folding 1J85 - they actually enable it
- and the effect is observed only within a range of temperatures that provide optimality.
We show that invoking non-native contacts is not necessary to generate the on-ribosome
slipknot: one only needs to employ a proper procedure to define the contacts that are declared
native. We identify the slipknot-based mechanism of folding and explain why the model
ribosome favors its formation. We also study the energetics involved in the emergence of
the slipknot. Unlike the claims of ref. [11] (that uses the same model), we do not observe
the slipknot in the absence of the ribosome.


2 Structure-based modeling 

The justification and details of implementation of our model are explained in refs. |22l 
Its character is structure-based or, equivalently, Go-like [21], and the molecular
dynamics deals only with the α-C atoms. The bonded interactions are described by
harmonic potentials. The list of pairs of amino acids that are considered to be in contact
(i.e. interacting) in the native state is known as the contact map. These contacts are
described by Lennard-Jones potentials with the minima at the crystallographically
determined distances. The potentials are identical in depth, denoted as ε. The value
of ε has been calibrated by making comparisons to the experimental data on stretching:
approximately, ε/Å is 110 pN (which also is close to the energy of the O-H-N hydrogen
bond of 1.65 kcal/mol). Non-native contacts are considered repulsive.
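
As an illustration of the contact term described above, the following minimal sketch assumes the standard 12-6 Lennard-Jones form, rescaled so that the minimum of depth ε sits exactly at the native distance. The function name and the choice of the 12-6 exponents are assumptions made for this example; the text only states that the minima are at the native distances and that all wells have the same depth.

```python
def native_contact_energy(r, r_native, eps=1.0):
    """12-6 Lennard-Jones energy (assumed form) for one native contact.

    The potential is written so that its minimum, equal to -eps,
    is located at r = r_native. Distances are in angstroms and the
    energy is returned in the units of eps.
    """
    if r <= 0.0:
        raise ValueError("distance must be positive")
    x = (r_native / r) ** 6
    return eps * (x * x - 2.0 * x)


if __name__ == "__main__":
    # Energy at the native distance and at 1.5 times the native distance.
    print(native_contact_energy(5.0, 5.0))
    print(native_contact_energy(7.5, 5.0))
```

In this form the well is still attractive, although weak, at 1.5 times the native distance, and strongly repulsive when the two α-C atoms approach much closer than the native distance.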

The contact map itself is obtained by using the overlap criterion in which the heavy 
atoms in the native conformation are represented by enlarged van der Waals spheres 
[221125j: if at least one pair of such spheres placed on different amino acids overlaps (OV),
then a native contact is declared. An alternative way to establish a contact map is by
invoking considerations that are more chemical in nature, as available at the CSU server
[26]. The CSU contacts may either be specific (like hydrogen bonds) or non-specific
(presumably the much weaker dispersive interactions). The CSU and OV maps have
many contacts in common but are not identical. Thus, in addition to the OV map,
we also consider the OV-CSU map, in which we augment OV by the missing specific CSU
contacts.
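
The overlap criterion can be sketched as follows. This is only an illustration under stated assumptions: the data layout (a list of residues, each a list of heavy atoms given as a position and a van der Waals radius), the function name, and the `enlargement` and `min_separation` parameters are hypothetical, and the enlargement factor is deliberately left as a required argument because its value is not given in the text.

```python
from itertools import combinations
from math import dist


def overlap_contact_map(residues, enlargement, min_separation=1):
    """Return residue pairs (i, j) whose enlarged atomic spheres overlap.

    residues       -- list in sequence order; each entry is a list of
                      (xyz, vdw_radius) tuples for the heavy atoms
    enlargement    -- factor multiplying the van der Waals radii
                      (hypothetical parameter; no value is assumed here)
    min_separation -- smallest sequence separation |i - j| considered
    """
    contacts = []
    for i, j in combinations(range(len(residues)), 2):
        if j - i < min_separation:
            continue
        # One overlapping pair of spheres is enough to declare a contact.
        if any(dist(p, q) < enlargement * (rp + rq)
               for p, rp in residues[i]
               for q, rq in residues[j]):
            contacts.append((i, j))
    return contacts
```

The returned list plays the role of the OV contact map; additional pairs taken from another source could be appended to such a list to build a combined map.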

The backbone stiffness is accounted for by the chirality potential [22]. The simulations 
are done at various temperatures. For most proteins without any knots, optimal folding
takes place at around T = 0.3 ε/k_B, which should correspond to the vicinity of room
temperature (k_B is the Boltzmann constant). We generate at least 200 trajectories for
each temperature considered. The time unit of the simulations, τ, is effectively of order
1 ns as the motion of the atoms is dominated by diffusion instead of being ballistic.

Folding is usually declared when all native contacts are established for the first time. 
For knotted proteins, however, this condition does not necessarily signify that the correct 
native knot has been formed. 

There are two important aspects of the role of the ribosome in the context of nascent
folding. The first is that folding of a protein is concurrent with its birth. Since the
mRNA is translated from 5' to 3', the proteins are synthesized from the N terminus to
the C terminus. The time interval between the emergence of two successive α-C atoms
will be denoted by t_w. The second aspect is that the surface of the ribosome provides
excluded volume and, therefore, reduces the conformational entropy. Both aspects can 
be captured by a model in which the ribosome is represented by an infinite plate which 
gives birth to a protein at one fixed location. We take the plate to generate a laterally
uniform potential of the form ε(σ_0/z)^9, where z denotes the distance away from the
plate and σ_0 = 4 × 2^(-1/6) Å. This form of the potential comes from integrating the energy
of interaction between a Lennard-Jones particle and a semi-infinite continuum below z = 0
and discarding the attractive part. A coarse-grained model of cotranslational folding with
a molecularly sculpted ribosome has been proposed by Elcock m. We have adopted a less
sophisticated model in order to enable simulations of hundreds of trajectories that last 
long - formation of a deep knot is a rare event and making simplifications is necessary. 
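
A minimal sketch of the wall term follows, assuming the potential is read as ε(σ_0/z)^9 with σ_0 = 4 × 2^(-1/6) Å. That reading, the absence of any numerical prefactor, and the function names are assumptions of this example.

```python
SIGMA_0 = 4.0 * 2.0 ** (-1.0 / 6.0)  # in angstroms (assumed reading of the text)


def wall_energy(z, eps=1.0, sigma0=SIGMA_0):
    """Purely repulsive wall energy eps * (sigma0 / z)**9 for z > 0."""
    if z <= 0.0:
        raise ValueError("the bead must stay above the plate (z > 0)")
    return eps * (sigma0 / z) ** 9


def wall_force(z, eps=1.0, sigma0=SIGMA_0):
    """Force along z, equal to -dV/dz; positive values push away from the plate."""
    if z <= 0.0:
        raise ValueError("the bead must stay above the plate (z > 0)")
    return 9.0 * eps * sigma0 ** 9 / z ** 10


if __name__ == "__main__":
    for z in (3.0, 5.0, 10.0):
        print(z, wall_energy(z), wall_force(z))
```

The steep ninth-power decay means that the plate acts essentially as an excluded-volume constraint: a bead a few σ_0 away from the plane feels almost no force.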

3 Results 

The PDB structure file of 1J85 provides coordinates of 156 residues. About half of them 
are arranged into seven α-helices and eight β-strands [20]. The native conformation of
1J85 is shown in panel A of Fig. [1]. Segments 1-74, 75-95, 96-120, and 121-156 are
shown in green, red, green, and purple respectively. The reason for this color convention
is that in order to form the knot in (at least) cotranslational folding, part of the purple segment
(121-141) must form a slip-loop that would go through the red knot-loop (see the defining
drawings in ref. [28]). The knot ends are at LEU-75 and LYS-119 in the native state -
a separation of 44 sites. On stretching, the knot tightens up and the ultimate separation 
between the knot ends becomes 10 BM- 

3.1 Equilibrium properties and thermal unfolding 

In order to set the stage, we first consider a situation in which 1J85 is set in the native 
state and then undergoes time evolution at various temperatures. At any finite T, some
number of contacts break down (the distance between the α-C atoms in residues that make
a contact becomes larger than 1.5 times the native distance) and some get restored. The top
panel of Fig. [2] shows the probability, P_0, of all native contacts being simultaneously
established as a function of T. Similar to the lattice models of proteins |2S| (see also an
exact analysis laDiEi]) one may define T_f as the temperature at which P_0 crosses through
1/2. For the OV contact map, T_f = 0.194 ε/k_B and for the OV-CSU contact map it is
0.204 ε/k_B, just a small shift. For both of these contact maps, P_0 is essentially close to
zero at T = 0.3 ε/k_B (1% and 3% for OV and OV-CSU respectively). However, at this
temperature, the fraction of the native contacts present, Q, is high (the middle panel of
Fig. [2]). Q crosses 1/2 at 0.827 and 0.908 ε/k_B for OV and OV-CSU respectively. Folding
is said to occur when all native contacts are established simultaneously and the kinetic
optimality (typically around 0.3 - 0.35 ε/k_B) may take place where P_0 is small.
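
The two equilibrium measures used above can be sketched as follows: a contact counts as present when its distance does not exceed 1.5 times the native one, Q is the fraction of contacts present in a frame, and the probability of all contacts being present is estimated as the fraction of frames in which none is broken. The function names and the frame-based estimate are assumptions of this example, not a description of the actual analysis code.

```python
def contact_is_present(distance, native_distance, factor=1.5):
    """A contact is present if its distance is within factor * native distance."""
    return distance <= factor * native_distance


def native_fraction(distances, native_distances, factor=1.5):
    """Q: fraction of native contacts present in a single frame."""
    flags = [contact_is_present(d, d0, factor)
             for d, d0 in zip(distances, native_distances)]
    return sum(flags) / len(flags)


def all_contacts_probability(frames, native_distances, factor=1.5):
    """Estimate of the probability that all contacts are present at once.

    frames -- list of frames, each a list of current contact distances
              in the same order as native_distances
    """
    hits = sum(
        1 for frame in frames
        if all(contact_is_present(d, d0, factor)
               for d, d0 in zip(frame, native_distances))
    )
    return hits / len(frames)
```

Because a single broken contact removes a frame from the second estimate, that probability can be close to zero at a temperature where Q is still high, which is the distinction drawn in the text.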

It should be noted that in ref. [11], T_f is defined as corresponding to the temperature
at which a half of the native bonds are present (i.e. around 0.8 ε/k_B) - a much more
relaxed criterion compared to P_0 crossing 1/2. At this elevated temperature, substantially
higher than the room temperature, the probability of maintaining the native conformation is
simply zero. However, the temperature at which Q crosses 1/2 signals the onset of globular
shapes in the model protein. The simulations reported on in ref. [11] could have been
done around the T_f defined through Q = 1/2 as this is the only temperature mentioned in
the text.

The fluctuations in the contact occupation numbers may or may not affect the sequential
locations of the knot ends. Fig. [3] shows locations of the knot in examples of
trajectories at several temperatures. At T = 0.3 ε/k_B, the knot ends stay fixed for at least
20 000 000 τ - a duration which is at least three orders of magnitude longer than the optimal
folding times of unknotted proteins within the same model [32]. At T = 1.00 ε/k_B, the
knot ends stay put for a while and then diffuse out of the chain rapidly. At T = 0.95 ε/k_B
we observe an intermittent behavior in which the knot disappears and is then restored.
If there is any intermittency at T = 0.9 ε/k_B, then the recovery time is longer than the
scale of the simulations. The bottom panel of Fig. [2] shows the median unfolding time
defined as in ref. [33], i.e. through the instant at which all contacts that are sequentially
separated by more than l are broken simultaneously. Taking l of 4 (local helical contacts)
was numerically unfeasible so we took l = 10. The statistics were based on 110 trajectories.
At T = 1.0 ε/k_B and below, less than 50% of the trajectories led to unfolding and the median
time could not be determined with the cutoff of 30 000 τ.
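
The reason why a median cannot be quoted when fewer than half of the trajectories unfold before the cutoff can be made explicit with a short sketch. Trajectories that do not unfold are treated as having an unknown, longer time; the function name and the use of `None` for such trajectories are assumptions of this example.

```python
def median_event_time(times, cutoff):
    """Median first-event time over trajectories with a simulation cutoff.

    times  -- one entry per trajectory: the event time, or None if the
              event was not observed before the cutoff
    cutoff -- maximal simulated time

    Returns None when too few trajectories show the event for the
    median to fall among the observed times.
    """
    n = len(times)
    observed = sorted(t for t in times if t is not None and t <= cutoff)
    if len(observed) <= n // 2:
        return None  # the median lies beyond the cutoff
    # Unobserved events are placed after all observed ones.
    padded = observed + [float("inf")] * (n - len(observed))
    if n % 2 == 1:
        return padded[n // 2]
    return 0.5 * (padded[n // 2 - 1] + padded[n // 2])
```

With this convention the median is well defined as soon as more than half of the trajectories show the event, without any need to know how long the remaining ones would have taken.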

3.2 Folding in unbounded space 

We now consider folding in an unbounded space when one starts from a random conformation
which is nearly fully extended. Even though we use the same model as in ref.
[11], we do not find even one single trajectory that would lead to correct folding. However,
there were trajectories which led to the establishment of all native contacts in an
unknotted way. We shall refer to such situations as corresponding to misfolding. One of
them is shown in panel C of Fig. [1]. The lack of proper folding in 1J85 essentially agrees
with the result of ref. [T^] (where 0.1% success rate is reported). We have generated 1201
trajectories at 0.35 ε/k_B and between 200 and 314 trajectories at 0.1, 0.2 and 0.3 ε/k_B.
The simulational cutoff time was 1 000 000 τ. With these statistics, a 1-2% success rate
claimed in ref. [11] (at unspecified temperatures) would mean observation of good folding
in at least 12 trajectories. We speculate that the discrepancy may be due to the following
factors: a) possible biases in the initial conformations used in that reference, b) malfunctioning
of the KMT algorithm (that may show knots when none are present), and c) the
folding temperature range is narrower than 0.025 ε/k_B, i.e. the steps we considered, and
the right T to fold was found accidentally. It should be noted that we have obtained a
total of 338 misfolded trajectories (17% of all trajectories). 143 of these are at 0.35 ε/k_B
and correspond to a mean folding time of 472 651 τ. It is possible to mistake some of the
misfolded states for the knotted ones. We find neither folding nor misfolding at T_f defined
by the condition of Q = 1/2.
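
The statistical argument above can be checked with a few lines of arithmetic. The sketch assumes that trajectories are independent and that a quoted success rate would apply at the temperature with 1201 trajectories; both are assumptions of this example, and the function names are hypothetical.

```python
def expected_successes(n_trajectories, success_rate):
    """Expected number of successful trajectories for a given rate."""
    return n_trajectories * success_rate


def probability_of_no_success(n_trajectories, success_rate):
    """Chance of seeing zero successes in independent trajectories."""
    return (1.0 - success_rate) ** n_trajectories


if __name__ == "__main__":
    n = 1201
    for rate in (0.01, 0.02):
        print(rate,
              expected_successes(n, rate),
              probability_of_no_success(n, rate))
```

For a 1% rate the expected count is about 12, matching the figure quoted in the text, and the chance of observing no success at all in 1201 independent runs is below one in a hundred thousand.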

Not much is changed when one attempts to reduce the conformational entropy by 
anchoring the C terminus. For the total of 427 trajectories (with T between 0.1 and
0.4 ε/k_B), 93 (i.e. 22%) established all native contacts without forming the knot.

We have also considered a different Go-like model, along the lines of Clementi et al.
[34]. In this model, the backbone stiffness is accounted for by the more common bond and
dihedral angle potentials. The contact interactions are given by the 10-12 Lennard-Jones
potentials, and we take the contact map to be given by OV (here we removed the
i, i+2 and i, i+3 contacts). For this model, T_f moves upward by about 0.2 ε/k_B and P_0
is about 0.01 at 0.6 ε/k_B. At this T, only one out of 246 trajectories led to the correct
folding with the proper knot (the folding time was 510 264 τ). At temperatures 0.7 ε/k_B
and 0.75 ε/k_B we get 1% and 7% of the correctly folded trajectories respectively and none
at lower temperatures. This model, however, is not the one considered in ref. [11].

3.3 Folding on the ribosome 

The percentage-wise success, S, of reaching the properly knotted folded conformation increases
substantially by simulating the process in the cotranslational way. The results are
summarized in Fig. [4]. They depend on the value of t_w but there appears to be a saturation
in S around t_w of 5 000 τ. The experimental times of translation are certainly much
longer than ~ 5 μs but this is not expected to affect S. However, the corresponding
S for misfolding (the inset in the lower panel) may reach saturation at a bigger t_w.

The data for the evidently optimal 0.35 ε/k_B were obtained based on 500 trajectories
for each t_w that was considered. The same statistics were used for 0.325 and 0.375 ε/k_B and
t_w of 5 000 τ. In other cases, there were at least 100 trajectories. The average combined
time of folding under the optimal conditions with t_w = 5000 τ is 860 353 τ. Comparable
times are at 0.325 and 0.375 ε/k_B.

Interestingly, switching to the Clementi et al.-like model generates no proper folding
independent of whether the contact map is OV, CSU, or OV-CSU (in the temperature
range between 0.45 and 1.3 ε/k_B). The backbone chain appears to be too stiff to allow
for a knot in this model.

The lower panel of Fig. [4] shows that the time evolution from extended states results
in a substantial percentage of the misfolded conformations. This percentage gets boosted
by the nascent conditions from 33% to 37% if the evolution takes place at 0.3 ε/k_B. At
this T, there is no knotted folding. On the other hand, at the knot-optimal temperature
of 0.35 ε/k_B the nascent conditions produce only 6.6% of the misfolded states.

Fig. [5] shows five snapshots of an example of a successful folding trajectory in our 
basic model of 1J85. (A related movie is available as the Supplementary Material). The 
sequential fragments emerge from the model ribosome in order: first green, then red, 
green again, and finally purple. At stage A, there is no tertiary order yet. At stage B, 
the knot-loop segment (75-95) has left the model ribosome. At stage C, the knot-loop 
forms a nearly planar and nearly closed contour and residue 121 arrives at the plane of 
this contour. Another perspective on this stage is shown in Fig. [6] where the knot-loop is 
shaded in red. The C-terminal part (121-156, in purple) of the protein must drag through 
the knot-loop and the success rate depends on how well residue 121 is pinned to the plane 
of the knot-loop. There are eight OV-based contacts that residue 121 makes with the
knot-loop, as shown in Fig. [7]. However, the stabilization of the attachment is enhanced
by the CSU-derived contact 88-121. A fully formed slipknot is shown in panel D of
Fig. [5] and, in more detail, in Fig. [8]. At the very next stage E, the protein gets detached
from the ribosome and becomes knotted because the C-terminal segment goes through
the knot-loop.

Our results confirm the picture proposed in ref. [11] that knotting in 1J85 is enabled
by the slipknotting mechanism. However, we see it operational only when the protein is 
nascent. The wall facilitates formation of the C-terminal slipknot on the correct side of 
the knot-loop and it provides semi-confinement that allows for making repeated attempts 
to drag the slip-loop through the knot-loop. When one removes the wall but starts the 
evolution from the slipknot conformation, then - at optimality - 75% of trajectories lead 
to the knot formation. 

It is interesting to note that inclusion of the 88-121 contact, at T = 0.35 ε/k_B and for
t_w = 5000 τ, boosts S from 3.0% to 4.8%. Inclusion of all missing CSU contacts (the
OV-CSU contact map) boosts it even further to 6.2%. All of these contacts should be
considered native. Wallin et al. [35] have argued that non-native contacts are necessary
to fold 1J85 to the knotted state. Specifically, they added exponentially decaying non-native
contacts between segments [86,93] and [122,147]. Their definition of a native contact
is that two heavy atoms from different residues must fall within the distance of 4.5 Å from
one another. This procedure misses the 88-121 contact. In our contact map, we generate
13 OV native contacts for the segments chosen by Wallin et al. CSU adds 4 more and
88-121 is the fifth one. Thus there is no need to invoke non-native contacts to explain
folding in 1J85.

3.4 The energetics of the slipknot formation during folding on the ribosome

We have observed that the knotting process itself is very rapid, whereas it takes quite a long time
for a protein to get to state (a) shown in Fig. [9], in which the slip-loop is positioned
just above the knot-loop. We expect that the specific amino-acid arrangement leads to
formation of a potential well and, therefore, emergence of a force that drags the slip-loop
through the knot-loop. To prove this, we have monitored the potential energy associated
with specific amino acids from the slip-loop. The top panels in Fig. [9] show the energy
experienced by one of the amino acids from the slip-loop, the 125th in the sequence, if
found at various locations within the plane parallel to the plane of the model ribosome
and crossing through this amino acid. The potential well is very localized. In state (a), its
depth is close to 0.50 ε and in state (b), 200 τ later, it increases to 1.0 ε. In state (c), the
slipknot is already created and the well becomes very shallow - 0.07 ε - and the slipknot
ceases to move forward any further. If at stage (a) the conformation of the knot-loop is
such that the well is not sufficiently deep then no proper knotting takes place.


3.5 Other deeply knotted proteins 

We have considered two other deeply knotted (hypothetical) proteins [19]: 1O6D from
Thermotoga maritima and 1VH0 from Staphylococcus aureus. Both were analyzed previously
through stretching simulations [6]. We find no properly folded trajectories for
1O6D and 1VH0, either with or without the wall. However, we have obtained substantial
percentages of the misfolding situations. For 1O6D, 41 out of 400 trajectories without the
wall and 151 out of 400 with the wall resulted in misfolding. For 1VH0, the corresponding
numbers are 136 out of 500 and 203 out of 390. Thus the ribosome-imitating wall helps in
folding through semi-confinement but not sufficiently to form the proper knots in
a noticeable way. However, more sophisticated models of the ribosome might work better.
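
The misfolding counts quoted above translate into percentages as in the short sketch below. The counts are taken from the text; the dictionary layout, the labels, and the spelling of the PDB codes as 1O6D and 1VH0 are assumptions of this example.

```python
# (misfolded trajectories, total trajectories) as quoted in the text
MISFOLDING_COUNTS = {
    ("1O6D", "without wall"): (41, 400),
    ("1O6D", "with wall"): (151, 400),
    ("1VH0", "without wall"): (136, 500),
    ("1VH0", "with wall"): (203, 390),
}


def misfolding_percentage(misfolded, total):
    """Percentage of trajectories reaching all native contacts unknotted."""
    return 100.0 * misfolded / total


if __name__ == "__main__":
    for (protein, condition), (hits, total) in MISFOLDING_COUNTS.items():
        print(protein, condition, round(misfolding_percentage(hits, total), 1))
```

For both proteins the wall raises the fraction of trajectories that establish all native contacts, from roughly 10% to 38% for the first and from roughly 27% to 52% for the second, even though none of them ends up knotted.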


4 Conclusions 

We have used the structure-based molecular dynamics model to study folding of deeply 
knotted proteins. The structure-based models favor formation of the native contacts but 
carry no topological bias. It is an open question how to implement such a bias in a model. 
Establishment of the native contacts usually does not mean establishing the knot. The 
nascent conditions are found to enhance the probability of establishing the contacts and 
sometimes also the formation of the knot. We also find that achieving proper folding 
requires staying within the proper range of temperatures. 

Our results are consistent with the experiments of Mallam et al. [36] that nascent
proteins form knots more easily and that knots in 1J85 (YibK) persist in the chemically
denatured state m. Independent of this, the GroEL-GroES chaperonin complex may
also accelerate knotted folding [36].

Our results for 1J85 support the slipknot mechanism in the knot formation but suggest 
that it may operate only under the nascent conditions and when the temperature is 
within an optimal range. The geometry of the ribosome tunnel should provide even 
more confinement than that given just by the plane and should boost the efficiency of 
cotranslational folding. Finally, there is no need to invoke non-native contacts to fold to 
the knotted state in this system. 

Proteins with shallow knots, such as MJ0366 from Methanocaldococcus jannaschii with 
the PDB structure code of 2EFV, studied theoretically in refs. |38l El] appear to fold in 
a different way both off- and on-ribosome. A discussion of this problem is being prepared 
for a separate report. 

Figure 1: Panel A shows the native structure of protein 1J85. Panel B shows an example of the
correctly folded state obtained through cotranslational evolution. Panel C shows an example 
of conformation with the native contacts all established but without the topology of the knot. 
The meanings of the colors are explained in the main text. 

Figure 2: The temperature dependence of P_0, Q, and t_unf for 1J85, top to bottom panels
correspondingly. The contact map used is indicated. The horizontal dotted lines in the top two
panels correspond to the value of 1/2.

Figure 3: Locations of the knot ends in 1J85 at various temperatures as indicated.

Figure 4: 1J85: the kinetic results for the on-ribosome folding. The upper panel shows the
percentage-wise success in folding as a function of T for t_w = 5000 τ. The inset shows S as a
function of t_w for the optimal temperature of 0.35 ε/k_B. The dotted circle shows S when one
contact, 88-121, is added. The full circle shows S when all specific CSU contacts are added.
The lower panel with its inset is similar but shows the data for misfolding - when the native 
contacts get established but the knot is not formed. 

Figure 5: Stages of the folding and knotting process of protein 1J85 in the presence of the repulsive
wall. The wall is represented by the blue plate. The number written on the plate indicates the
residue that has just been born. The part of the protein which creates the loop is marked in red.
Residue 121 is represented by a purple ball in panel C - it is the beginning of a slipknot
part. The arrows in panels C and D indicate the direction in which the remaining part of the 
backbone moves through the red loop. The completely knotted, folded and released protein is 
presented in panel E without the wall as the protein gets detached. 

Figure 6: Conformation of the nascent 1J85 just before formation of the slipknot state. Residue
121, represented by a purple ball, stays pinned to the shaded surface spanned by the red knot- 
loop. In order to form the knot, the remaining C-terminal part of the protein has to drag 
through the knot-loop in the direction shown by the arrow. 



Figure 7: The contacts involving residue 121 in 1J85. The contacts shown in the orange color 
are derived through the overlap criterion. The contact shown in the blue color is an extra 
contact derived through the CSU approach. All of these contacts are with the residues that are 
a part of the red knot-loop. 

Figure 8: The fully formed slipknot state in 1J85: the purple part of the backbone just went
through the red knot-loop. The purple spheres indicate the beginning (121) and ending (141)
residues in the slip-loop. 



Figure 9: Three successive stages of the successful knotting process of protein 1J85 in the 
presence of the repulsive wall. The upper panels show the potential energies associated with 
amino acid 125 in units of ε, as indicated by the color scales on the right. The chain shows
the projection of the knot-loop onto the x-y plane which is parallel to the repulsive plane.
The origin (0,0) is set at the location of amino acid 125. The lower panels show the corresponding
conformations of the protein. The thick arrow next to C indicates the sequential direction toward
the C terminus which is not yet born. The dotted arrows indicate the direction of movement 
of amino acid 125. Panel (a) is for the situation in which the slipknot is about to be formed 
and panel (c) just after it was fully formed. 